Speeding Up Logistic Model Tree Induction

Authors

  • Marc Sumner
  • Eibe Frank
  • Mark A. Hall
Abstract

Logistic Model Trees have been shown to be very accurate and compact classifiers [8]. Their greatest disadvantage is the computational complexity of inducing the logistic regression models in the tree. We address this issue by using the AIC criterion [1] instead of crossvalidation to prevent overfitting these models. In addition, a weight trimming heuristic is used which produces a significant speedup. We compare the training time and accuracy of the new induction process with the original one on various datasets and show that the training time often decreases while the classification accuracy diminishes only slightly.
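
The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that AIC (computed as -2 log L + 2k) is used to pick among candidate models of increasing size, and that weight trimming keeps only the highest-weight training instances that together carry a chosen fraction of the total weight. The function names and the `keep_mass` parameter are hypothetical and do not come from the paper.

```python
def aic(log_likelihood, num_params):
    """Akaike Information Criterion; lower values are preferred."""
    return -2.0 * log_likelihood + 2.0 * num_params


def select_by_aic(log_likelihoods, param_counts):
    """Return the 1-based index of the candidate model with the smallest AIC.

    Assumption: candidate i has log-likelihood log_likelihoods[i] on the
    training data and param_counts[i] free parameters.
    """
    scores = [aic(ll, k) for ll, k in zip(log_likelihoods, param_counts)]
    best = min(range(len(scores)), key=scores.__getitem__)
    return best + 1, scores


def trim_weights(weights, keep_mass=0.9):
    """Return sorted indices of the highest-weight instances that together
    carry at least keep_mass of the total weight (hypothetical parameter)."""
    total = sum(weights)
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        if mass >= keep_mass * total:
            break  # enough weight mass retained; drop the remaining instances
        kept.append(i)
        mass += weights[i]
    return sorted(kept)
```

Unlike cross-validation, a selection of this kind needs only one pass over the candidates on the training data, which is where the saving in training time would come from under these assumptions.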

Similar resources

Stepwise Induction of Logistic Model Trees

In statistics, logistic regression is a regression model to predict a binomially distributed response variable. Recent research has investigated the opportunity of combining logistic regression with decision tree learners. Following this idea, we propose a novel Logistic Model Tree induction system, SILoRT, which induces trees with two types of nodes: regression nodes, which perform only univar...

Logistic Model Tree With Modified AIC

Logistic Model Trees have been shown to be very accurate and compact classifiers. Their greatest disadvantage is the computational complexity of inducing the logistic regression models in the tree. This issue is addressed by using the modified AIC criterion instead of crossvalidation to prevent overfitting these models. In addition, to fill the missing values, mean and mode are used class wise ...

Ranking stocks of listed companies on Tehran stock exchange using a hybrid model of decision tree and logistic regression

Much research has introduced linear or nonlinear models using statistical models and machine learning tools in artificial intelligence to estimate Iran's rate of return. The primary purpose of these methods is to use different independent variables simultaneously to improve the modeling of stock return rates. However, in predicting the rate of return, in addition to the modeling method, the degree of co...

A Decision Tree Approach to Predicting Recidivism in Domestic Violence

Domestic violence (DV) is a global social and public health issue that is highly gendered. Being able to accurately predict DV recidivism, i.e., re-offending of a previously convicted offender, can speed up and improve risk assessment procedures for police and front-line agencies, better protect victims of DV, and potentially prevent future reoccurrences of DV. Previous work in DV recidivism ha...

Speeding up Training with Tree Kernels for Node Relation Labeling

We present a method for speeding up the calculation of tree kernels during training. The calculation of tree kernels is still heavy even with efficient dynamic programming (DP) procedures. Our method maps trees into a small feature space where the inner product, which can be calculated much faster, yields the same value as the tree kernel for most tree pairs. The training is sped up by using th...

Publication date: 2005